Data Visualization
In the following section, we can see different visualizations based on the gathered information. The sources from which the data was obtained can be found in the Data Sources section. The collected data was analyzed and data cleaning steps were applied. Above each visualization, the code for the data cleaning and the visualization building steps is available.
Below are the libraries used in this section:
Libraries
library(tidyverse)
library(ggplot2)
library(forecast)
library(astsa)
library(xts)
library(tseries)
library(fpp2)
library(fma)
library(lubridate)
library(TSstudio)
library(quantmod)
library(tidyquant)
library(plotly)
library(gridExtra)
library(readxl)
library(writexl)
library(imputeTS)
library(zoo)
library(knitr)
library(kableExtra)
library(patchwork)
library(vars)
library(dplyr)
library(tidyr)Function stocks_NAs()
stocks_NAs <- function(data_frame){
symbol <- max(data_frame$symbol)
# Select relevant columns
data_frame <- data.frame(data_frame$date, data_frame$adjusted)
# Rename columsn
names(data_frame) <- c("date", "adjusted")
# Create a sequence of dates from start_date to end_date
start_date <- as.Date(min(data_frame$date))
end_date <- as.Date(max(data_frame$date))
# Create data range
date_range <- seq(start_date, end_date, by = "1 day")
# Create a dataset with the date range
date_dataset <- data.frame(Date = date_range)
# Merge dataframes
data_frame <- merge(data_frame, date_dataset, by.x = "date", by.y = "Date", all = TRUE)
# Extract rows with missing values
df_na_rows <- data_frame[which(rowSums(is.na(data_frame)) > 0),]
# Extract columns with missing values
df_na_cols <- data_frame[, which(colSums(is.na(data_frame)) > 0)]
# Modify data
imputed_time_series <- na_ma(data_frame, k = 4, weighting = "exponential")
# Add modified data
data_frame <- data.frame(imputed_time_series)
# Change data type
data_frame$date <- as.Date(data_frame$date,format = "%m/%d/%y")
names(data_frame) <- c("date", symbol)
return(data_frame)
}Function merge_dataframes
merge_dataframes <- function(dataframes_list) {
# Perform inner join using the first data frame as the base
df_complete <- dataframes_list[[1]]
# Iterate over the remaining data frames and merge
for (i in 2:length(dataframes_list)) {
df_complete <- merge(df_complete, dataframes_list[[i]], by = "date", all.x = TRUE)
}
return(df_complete)
}Global Oil Production
The above plot is an animation created using Tableau software. The data in this visualization consists of the top 15 oil-producing countries for the year 2022. As the years progress, the colors of the countries change based on each country’s production values. The selected time period is from 2000 to 2022.
Based on the information from the visualization, from the year 2000 to 2011, the countries with less oil production were Kazakhstan and Qatar. On the other hand, the top 2 oil-producing countries were Saudi Arabia and Russia.
From the year 2011 to 2014, the United States had a significant increase in oil-production, making it really close to Saudi Arabia and Russia. By the year 2018, it had a production high above the previously mentioned countries making it the top 1. This situation continued until the year 2022. The rest of the countries remain with lower price values.
Data
# Import dataset
df_oil_production <- read_csv('./data/oil-production-by-country.csv')
# Rename the columns of the dataframe
names(df_oil_production) <- c('Country', 'Code', 'Year', 'Oil_production_TWh')
# Calculate the total oil production by countries for 2022.
top_countries <- df_oil_production %>%
filter(Year == 2022 & str_length(Code) == 3) %>%
group_by(Country) %>%
summarize(total_production = sum(Oil_production_TWh))
# Keep the top 10 countries with higher oil production for 2022
top_countries <- top_countries %>%
arrange(desc(total_production)) %>%
head(15)
# Filter dataframe by the top 10 oil producing countries
df_oil_production <- df_oil_production %>%
filter(Country %in% top_countries$Country)
# Save dataframe as a new file
write.csv(df_oil_production, './data/viz_oil_production.csv', row.names = FALSE)U.S. Oil Production
This project is primarily focused on the United States, and as such we are analyzing its oil production by month for the past 22 years.
From the plot, we can see that for the first 11 years, from 2000 to 2011, the monthly oil production was below 6 thousand barrels. There is an upward trend until the middle of 2015, when the production increased almost to 10 thousand barrels per month. For the next year, the production had a slight decrease, but the values were still higher compared to the first years of analysis.
In the following years, we can see that there was a big increase once again, reaching the 13 thousand barrels production, but due to the COVID pandemic, the production had a great fall, causing the increase trend to slow down.
Data
# Get oil price data from stock market
df_oil_production_us <- read_csv('./data/oil_production.csv')
# # Rename Columns
names(df_oil_production_us) <- c("Data")
# Filter rows
df_oil_production_us <- tail(df_oil_production_us, nrow(df_oil_production_us)-3)
# Separate Columns
df_oil_production_us <- df_oil_production_us %>%
separate('Data', into = c("Date", "Production"), sep = ";")
# Chage data type
df_oil_production_us$Date <- as.Date(df_oil_production_us$Date, format = "%d/%m/%Y")
# Change data type
df_oil_production_us$Production <- round(as.numeric(df_oil_production_us$Production), 2)
#Save dataframe as a new file
write.csv(df_oil_production_us, './data/viz_us_oil_production.csv', row.names = FALSE)Visualization
# Chage data type
df_oil_production_us <- df_oil_production_us %>% filter(year(Date) >= 2000 & year(Date) <= 2022)
# Create plot
viz_oil_production <- plot_ly(data = df_oil_production_us, x = df_oil_production_us$Date, y = df_oil_production_us$Production, type = 'bar') %>%
layout(
title = 'U.S. Oil Production',
xaxis = list(title = 'Date'),
yaxis = list(title = 'Production (thousand barrels)'),
showlegend = FALSE
)
# Show plot
ggplotly(viz_oil_production)Crude Oil Price
From 2003 to 2008, oil prices rose, quadrupling and setting numerous records. The 2007 subprime mortgage crisis, triggered by the collapse of the 2006 U.S. housing bubble, led to a severe global liquidity crisis and subsequent economic events, including the January 2008 stock market crisis, the October 2008 global stock market crisis, the international economic downturn, and the Great Recession.
In 2015, crude oil prices fell due to oversupply relative to global demand, resulting in steadily rising inventories throughout the year.
The COVID-19 pandemic caused a sharp drop in oil prices due to government-imposed shutdowns, stay-at-home orders, and travel restrictions. An unprecedented event occurred on April 20th, 2020, when oil prices turned negative. In March, an oil price war broke out between Russia and Saudi Arabia due to disagreements over production levels.
By the early summer of 2020, oil prices rebounded as nations lifted shutdowns, and OPEC agreed to substantial crude oil production cuts. Major central banks stopped raising interest rates, supporting demand. Towards the end of the year, optimism surrounded the planned launch of several COVID-19 vaccines, supporting the market.
Data
# Get oil price data from stock market
df_oil_price <- tq_get("CL=F", get = "stock.prices", from = "2000-01-01", to = "2022-12-31")
# Calculate adjusted price
df_oil_price$adjusted <- (df_oil_price$adjusted)/(df_oil_price$adjusted[1])
# Create dataframe
df_oil_price <- data.frame(df_oil_price,rownames(df_oil_price))
# Save dataframe as a new file
write.csv(df_oil_price, './data/viz_oil_price.csv', row.names = FALSE)Visualization
# Create plot
viz_oil_price <- plot_ly(df_oil_price, type = 'scatter', mode = 'lines')%>%
add_trace(x = df_oil_price$date, y = df_oil_price$adjusted, name = 'Price', showlegend = FALSE)%>%
layout(title = 'Crude Oil Price by Day',
xaxis = list(rangeslider = list(visible = T)))
options(warn = -1)
# Set plot layout
viz_oil_price <- viz_oil_price %>%
layout(
xaxis = list(zerolinecolor = '#ffff',
zerolinewidth = 2,
gridcolor = 'ffff'),
yaxis = list(zerolinecolor = '#ffff',
zerolinewidth = 2,
gridcolor = 'ffff',
title = 'Index'),
plot_bgcolor='#e5ecf6')
# Show plot
ggplotly(viz_oil_price)Inflation Rate
The bar chart above was created using the Plotly package for R. The data in this visualization consists of the annual percentage change in inflation for the U.S. from the year 2000 to 2022, plotted monthly. To enhance clarity, certain bars in the visualization have been highlighted to make the context explanations below easier to understand.
The Consumer Price Index (CPI), calculated by the Bureau of Labor Statistics, is a widely used measure of inflation. However, the Federal Reserve prefers the Personal Consumption Expenditures (PCE) index because it provides a clearer picture of inflation trends that are less affected by short-term price changes, such as food and energy. The Fed uses monetary policy to manage inflation over the economic cycles.
In 2000, there was a business cycle expansion of 4.1%, mainly due to the Tech Bubble burst. In 2001, inflation was 1.60% YOY, peaking in March due to President Bush’s tax cut and hitting a trough in November after the 9/11 attacks.
In 2007, inflation peaked at 4.1% in December in the midst of the banking crisis. In 2008, during the International Financial Crisis, YOY inflation was only 0.1%, coupled with a GDP contraction of -0.1%.
In 2014 and 2015, annual inflation rates of 0.80% and 0.90% respectively accompanied business cycle expansion and GDP growth. The end of the quantitative easing in 2014 and deflation in oil and gas prices in 2015 were key events.
In 2020, the COVID-19 pandemic triggered a global economic shutdown, resulting in a 1.4% inflation rate and a -3.4% GDP growth. In 2021, government recovery efforts led to a 5.9% GDP growth and 7% YOY inflation. The Fed responded by raising interest rates, leading to a high inflation rate of 6.50%, and business activity contraction in 2022. Russia’s invasion of Ukraine had a significant impact on the U.S. and global economies.
Data
# Import dataset
df_inflation <- read_csv('./data/inflation.csv')
# Filter information to keep United States total inflation rate by moth.
df_inflation <- df_inflation %>%
filter(LOCATION == "USA" & SUBJECT == "TOT" & FREQUENCY == "M" & MEASURE == "AGRWTH")
# Format the Time column
df_inflation$TIME <- paste(df_inflation$TIME, "-01", sep = "")
# Chage data type
df_inflation$TIME <- as.Date(df_inflation$TIME)
# Select relevant columns
df_inflation <- df_inflation %>% dplyr::select('TIME', 'Value')
# Save dataframe as a new file
write.csv(df_inflation, './data/viz_inflation.csv', row.names = FALSE)Visualization
# Filter data
df_inflation <- df_inflation %>% filter(TIME >= '2000-01-01' & TIME < '2023-01-01')
# Create plot
viz_inflation <- plot_ly(data = df_inflation, x = ~TIME, y = ~Value, type = 'bar',
marker = list(color = ifelse( grepl("^2000-", df_inflation$TIME) |
grepl("^2007-", df_inflation$TIME) |
grepl("^2014-", df_inflation$TIME) |
grepl("^2022-", df_inflation$TIME),
"red",
ifelse( grepl("^2015-", df_inflation$TIME) |
grepl("^2008-1", df_inflation$TIME)|
grepl("^2020-", df_inflation$TIME),
"green",
"lightblue")))) %>%
layout(
title = 'U.S. Inflation Variations by Month',
xaxis = list(title = 'Time'),
yaxis = list(title = 'Rate'),
showlegend = FALSE
)
# Show plot
ggplotly(viz_inflation)Interest Rate
The bar chart visualization above was created using the Plotly package for R. The data in this graph illustrates the annual interest rates for the U.S. from the year 2000 through 2021. Certain bars in the visualization have been highlighted in color red and green in order to show the different increases and decreases in the interest rates, which are explained below.
In 2000, after a prolonged period of economic expansion, the US Federal Reserve funds rate was 6.5%. The bursting of the technology bubble and the 9/11 attacks triggered a global economic slowdown. This led to 11 interest rate reductions, bringing it down to 1.75% by the end of the year.
In 2002-2003, the Fed continued to cut rates due to a weak recovery and low inflation. In 2005-2006, they raised rates 17 times to cool the economy and address the real estate bubble. In 2007-2008, the housing market crisis and rising unemployment prompted the Fed to cut rates from 4.75% to 2%.
By the end of 2008, during the Great Recession, rates were reduced to a range of 0 to 0.25%, with no inflationary pressures. Quantitative easing was introduced to stimulate the economy.
Seven years later, the Fed cautiously raised rates as the economy recovered, with the first hike coming in December 2015. Economic concerns from China in early 2016 and falling oil prices led to a year-long pause in rate hikes. In 2019, U.S.-China trade tensions prompted three rate cuts in the second half of the year to support the economy.
In 2020, the COVID-19 pandemic caused massive job losses and an unemployment rate of 14.7%, leading to two emergency rate cuts that returned rates to the 0-0.25% range.
Data
# Import dataset
df_interest <- read_excel('./data/interest_rates.xlsx', sheet = "Annual", .name_repair = "unique_quiet")
# Filter information to keep Central Back Policy Rate.
df_interest <- df_interest %>% filter(Indicator == 'Central Bank Policy Rate')
# Convert dataframe from wide to long
df_interest <- df_interest %>%
gather(key = "Year", value = "Value", -Indicator, -'...2', -'...3', -Scale)
# Select relevant columns
df_interest <- df_interest %>% dplyr::select('Year', 'Value')
# Change data type
df_interest$Value <- round(as.numeric(df_interest$Value), 2)
# Save dataframe as a new file
write.csv(df_interest, './data/viz_interest.csv', row.names = FALSE)Visualization
# Set color variables for each data point
color_2000 <- 'rgba(222,45,38,0.8)' # red
color_2001 <- 'rgba(204,204,204,1)' # gray
color_2002 <- 'rgba(210,238,130,1)' # green
color_2003 <- 'rgba(210,238,130,1)' # green
color_2004 <- 'rgba(204,204,204,1)' # gray
color_2005 <- 'rgba(222,45,38,0.8)' # red
color_2006 <- 'rgba(222,45,38,0.8)' # red
color_2007 <- 'rgba(222,45,38,0.8)' # red
color_2008 <- 'rgba(210,238,130,1)' # green
color_2009 <- 'rgba(204,204,204,1)' # gray
color_2010 <- 'rgba(204,204,204,1)' # gray
color_2011 <- 'rgba(204,204,204,1)' # gray
color_2012 <- 'rgba(204,204,204,1)' # gray
color_2013 <- 'rgba(204,204,204,1)' # gray
color_2014 <- 'rgba(204,204,204,1)' # gray
color_2015 <- 'rgba(222,45,38,0.8)' # red
color_2016 <- 'rgba(222,45,38,0.8)' # red
color_2017 <- 'rgba(204,204,204,1)' # gray
color_2018 <- 'rgba(204,204,204,1)' # gray
color_2019 <- 'rgba(222,45,38,0.8)' # red
color_2020 <- 'rgba(210,238,130,1)' # green
color_2021 <- 'rgba(204,204,204,1)' # gray
# Create plot
viz_interest <- plot_ly(df_interest, x = df_interest$Year, y = df_interest$Value, type = 'bar',
marker = list(color = c(color_2000, color_2001, color_2002, color_2003, color_2004,
color_2005, color_2006, color_2007, color_2008, color_2009,
color_2010, color_2011, color_2012, color_2013, color_2014,
color_2015, color_2016, color_2017, color_2018, color_2019,
color_2020, color_2021)),
text = df_interest$Value,
textposition = 'outside',
textangle = 0)
# Set de layout of the plot
viz_interest <- viz_interest %>% layout(title = "Interest Rate by Year",
xaxis = list(title = "Year"),
yaxis = list(title = "Rate"))
# Show plot
ggplotly(viz_interest)Food Industry
As it was introduced before, the food industry is very important for the analysis of oil production and prices on different industries. To analyze this sector we chose some of the main companies leading the Food and Beverages Sector.
In this case, we can see a clear upward trend that was directly affected by two events that caused a drop in the following quarters. The first occurred in 2008 due to the financial crisis, where the recovery of production was slower compared to the beginning of 2020, when the COVID pandemic took place. In this second event, the trends increased at a higher rate for some of the companies.
Further analysis will be carried and as a sample we are going to use Tyson Foods, Inc (TSN).
Data
df_list <- list()
start <- "2000-01-01"
end <- "2022-12-31"
tickers <- c("GIS", "KHC", "MDLZ", "TSN", "ADM")
for(i in tickers){
df <- tq_get(i, get = "stock.prices", from = start, to = end)
df <- stocks_NAs(df)
df_list <- append(df_list, list(df))
}
# Merge datasets
df_companies_food <- merge_dataframes(df_list)
# Save dataframe as a new file
write.csv(df_companies_food, './data/viz_food_companies.csv', row.names = FALSE)Visualization
df_companies_food <- gather(df_companies_food, key = "stock", value = "price", -date)
# Create ggplot line plot
viz_food_companies <- ggplot(df_companies_food, aes(x = date, y = price, color = stock)) +
geom_line() +
labs(title = "Stock Prices Over Time",
x = "Date",
y = "Stock Price") +
scale_color_discrete(name = "Stock")
# Show plot
ggplotly(viz_food_companies)Healthcare Industry
To analyze this sector we chose some of the main companies leading the Healthcare Index. This allows us to understand the performance, volatility and trends of the major companies in the sector.
During the first 13 years of analysis, we can see that the index is below the stock prices are below 100 usd per share, with very low volatility. In the following 7 years, Thermo Fisher Scientific Inc (TMO) and UnitedHealth Group Inc (UNH), improved their performance, reaching values higher than 200 usd per share. We can clearly see the upward trend along these years.
The COVID pandemic had a different effect on these companies. In the first few weeks, there was a sharp decline that caused the prices to fall. However, the companies’ performance improved very quickly as demand for the products increased to levels never seen before. As a result, Thermo Fisher Scientific Inc (TMO) price per share rose to values higher than 600 usd. This reflects the dependence of societies on the healthcare sector and the high expectations for the recovery of their lifestyles. Over the next few years, the company maintained high prices, but we can see that there was a lot of volatility.
Further analysis will be carried and as a sample we are going to use Thermo Fisher Scientific Inc (TMO).
Data
df_list <- list()
start <- "2000-01-01"
end <- "2022-12-31"
tickers <- c("JNJ", "LLY", "NVS", "PFE", "TMO", "UNH")
for(i in tickers){
df <- tq_get(i, get = "stock.prices", from = start, to = end)
df <- stocks_NAs(df)
df_list <- append(df_list, list(df))
}
# Merge datasets
df_companies <- merge_dataframes(df_list)
# Save dataframe as a new file
write.csv(df_companies, './data/viz_healthcare_companies.csv', row.names = FALSE)Visualization
df_companies <- gather(df_companies, key = "stock", value = "price", -date)
# Create ggplot line plot
viz_healthcare_index <- ggplot(df_companies, aes(x = date, y = price, color = stock)) +
geom_line() +
labs(title = "Stock Prices Over Time",
x = "Date",
y = "Stock Price") +
scale_color_discrete(name = "Stock")
# Show plot
ggplotly(viz_healthcare_index)Bus Transport
By observing the Bus Passengers plot we can observe there is no clear trend as through the years the values are very similar. Up to 2020, with COVID, the activity dropped. We can se a clear trend we will analyze later.
Data
# Import dataset
df_bus_passengers <- read_excel('./data/bus_passengers.xlsx', sheet = "UPT", .name_repair = "unique_quiet")
# Filter relevant rows
df_bus_passengers <- df_bus_passengers %>% filter(df_bus_passengers$'3 Mode' == "Bus")
# Convert dataframe from wide to long
df_bus_passengers <- df_bus_passengers %>% gather(key = "Year_Month", value = "Value", -'NTD ID', -'Legacy NTD ID', -'Agency', -'Status', -'Reporter Type', -'UACE CD', -'UZA Name', -'Mode', -'TOS', -'3 Mode')
# Select relevant columns
df_bus_passengers <- df_bus_passengers %>% dplyr::select('3 Mode', 'Year_Month', 'Value')
df_bus_passengers <- df_bus_passengers %>% separate(Year_Month, into = c("Month", "Year"), sep = "/")
df_bus_passengers <- df_bus_passengers %>% mutate(Date = paste0(Year,'-', Month, '-', "01"))
df_bus_passengers$Date <- as.Date(df_bus_passengers$Date)
# Select relevant columns
df_bus_passengers <- df_bus_passengers %>% dplyr::select('Date', 'Value')
# Filter relevant rows
df_bus_passengers <- df_bus_passengers %>%
group_by(Date) %>%
summarize(Value = sum(Value))
# Change data type
df_bus_passengers$Value <- as.integer(df_bus_passengers$Value)
# Save dataframe as a new file
write.csv(df_bus_passengers, './data/viz_bus_passengers.csv', row.names = FALSE)Visualization
names(df_bus_passengers) <- c('DATE', 'Value')
# Chage data type
df_bus_passengers <- df_bus_passengers %>% filter(year(DATE) <= 2022)
# Create plot
viz_bus_passengers <- plot_ly(data = df_bus_passengers, x = ~DATE, y = ~Value/1000000, type = 'bar') %>%
layout(
title = 'U.S. Bus Passengers',
xaxis = list(title = 'Date'),
yaxis = list(title = 'Passenger Trips (millions)'),
showlegend = FALSE
)
# Show plot
ggplotly(viz_bus_passengers)Bicycles Consumption
When analyzing the bicycles personal consumption we can observe there is an upward trend over the years. We can observe a drop in the year 2020 due to COVID pandemic. The recovery in this case was rapid and very positive for this sector of the industry.
Data
# Import dataset
df_bikes <- read_excel('./data/personal_expenditure_by_industry.xlsx', sheet = "U20405-M", .name_repair = "unique_quiet")
# Make first row the column headers
names(df_bikes) <- append(c('Line', 'Indicator', 'Code'), as.character(df_bikes[7, ])[4:ncol(df_bikes)])
# Filter relevant rows
df_bikes <- df_bikes %>% filter(Indicator == "Bicycles and accessories")
# Convert dataframe from wide to long
df_bikes <- df_bikes %>% gather(key = "Year-Month", value = "Value", -'Line', -'Indicator', -'Code')
# # Change data type
df_bikes$Value <- round(as.numeric(df_bikes$Value), 2)
# Select relevant columns
df_bikes <- df_bikes %>% dplyr::select('Year-Month', 'Value')
# Separate Year-Month
df_bikes <- df_bikes %>%
separate('Year-Month', into = c("Year", "Month"), sep = "M")
# Create DATE column
df_bikes <- df_bikes %>%
mutate(DATE = make_date(Year, Month, day = 1))
# Select relevant columns
df_bikes <- df_bikes %>% dplyr::select('DATE', 'Value')
# Chage data type
df_bikes$DATE <- as.Date(df_bikes$DATE)
# Save dataframe as a new file
write.csv(df_bikes, './data/viz_bikes.csv', row.names = FALSE)Visualization
# Chage data type
df_bikes <- df_bikes %>% filter(year(DATE) >= 2000)
# Create plot
viz_bikes <- plot_ly(data = df_bikes, x = ~DATE, y = ~Value, type = 'bar') %>%
layout(
title = 'U.S. Bicycles and accessories',
xaxis = list(title = 'Date'),
yaxis = list(title = 'Personal Consumption Expenditures (millions of dollars)'),
showlegend = FALSE
)
# Show plot
ggplotly(viz_bikes)Air Carrier Domestic and International
Description
Data
# Import dataset
df_air_transport <- read_excel('./data/personal_expenditure_by_industry.xlsx', sheet = "U20405-M", .name_repair = "unique_quiet")
# Make first row the column headers
names(df_air_transport) <- append(c('Line', 'Indicator', 'Code'), as.character(df_air_transport[7, ])[4:ncol(df_air_transport)])
# Filter relevant rows
df_air_transport <- df_air_transport %>% filter(Indicator == "Air transportation (64)")
# Convert dataframe from wide to long
df_air_transport <- df_air_transport %>% gather(key = "Year-Month", value = "Value", -'Line', -'Indicator', -'Code')
# Change data type
df_air_transport$Value <- round(as.numeric(df_air_transport$Value), 2)
# Select relevant columns
df_air_transport <- df_air_transport %>% dplyr::select('Year-Month', 'Value')
# Separate Year-Month
df_air_transport <- df_air_transport %>%
separate('Year-Month', into = c("Year", "Month"), sep = "M")
# Create DATE column
df_air_transport <- df_air_transport %>%
mutate(DATE = make_date(Year, Month, day = 1))
# Select relevant columns
df_air_transport <- df_air_transport %>% dplyr::select('DATE', 'Value')
# Chage data type
df_air_transport$DATE <- as.Date(df_air_transport$DATE)
# Save dataframe as a new file
write.csv(df_air_transport, './data/viz_air_transport.csv', row.names = FALSE)Visualization
# Chage data type
df_air_transport <- df_air_transport %>% filter(year(DATE) >= 2000)
# Create plot
viz_air_transport <- plot_ly(data = df_air_transport, x = ~DATE, y = ~Value, type = 'bar') %>%
layout(
title = 'U.S. Air Carrier Domestic and International',
xaxis = list(title = 'Date'),
yaxis = list(title = 'Personal Consumption Expenditures (millions of dollars)'),
showlegend = FALSE
)
# Show plot
ggplotly(viz_air_transport)